Skip to content

fix(assistants): reconcile MCP server set on every turn#3013

Open
danielkov wants to merge 3 commits into
mainfrom
daniel/age-2516-bug-newly-added-mcp-servers-not-appearing-in-assistants
Open

fix(assistants): reconcile MCP server set on every turn#3013
danielkov wants to merge 3 commits into
mainfrom
daniel/age-2516-bug-newly-added-mcp-servers-not-appearing-in-assistants

Conversation

@danielkov
Copy link
Copy Markdown
Contributor

Summary

  • Server resolves the assistant's current MCP set on every /turn and sends it in the runtime dispatch body.
  • Runner forwards the list to a new McpCmd::Reconcile: registers + connects newly attached servers (surfacing assistant_mcp_auth_required notices through the thread inbox on AuthRequired), disconnects servers no longer in scope.
  • mcp_force_reconnect's server_id enum is dropped — it was frozen at bootstrap and went stale the moment reconciliation landed.

Previously a newly attached integration (e.g. GitHub MCP) stayed invisible to the running assistant until the runtime was restarted, leaving the model unable to use it or to invoke mcp_force_reconnect for it.

Closes AGE-2516

✻ Clauded...

danielkov added 2 commits May 22, 2026 23:59
The runner only saw the MCP server list its initial bootstrap returned;
toolset edits made after that point (e.g. attaching GitHub MCP) never
reached a live runtime and the model could not list, call, or
mcp_force_reconnect those servers.

Server now resolves the current MCP set from the assistant's toolsets on
each /turn and includes it in the dispatch body. The runner forwards it
to a new McpCmd::Reconcile against a per-thread known set: new servers
get registered + connected (with assistant_mcp_auth_required notices
injected via the thread inbox on AuthRequired), removed servers get
disconnected. The mcp_force_reconnect tool's server_id schema enum is
dropped — it was frozen at bootstrap and lied as soon as reconciliation
landed.
@danielkov danielkov requested a review from a team as a code owner May 22, 2026 22:59
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@linear-code
Copy link
Copy Markdown

linear-code Bot commented May 22, 2026

AGE-2516

@vercel
Copy link
Copy Markdown

vercel Bot commented May 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gram-docs-redirect Ready Ready Preview, Comment May 22, 2026 11:11pm

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 22, 2026

🦋 Changeset detected

Latest commit: a83ee4f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
server Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions Bot added the preview Spawn a preview environment label May 22, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d231e6e8a9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +577 to +580
if let Err(err) = manager.disconnect_server(&server_uid).await {
tracing::debug!(server_id = %id, error = %err, "reconcile disconnect (ignored)");
}
ctx.known.remove(&id);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove detached MCP servers from reconnectable state

When a server is removed during reconcile, this code only calls disconnect_server and then deletes the ID from ctx.known, but it never removes the server config from McpServerManager. Because ForceReconnect later calls connect_server by ID against manager configs, a previously detached server can be reconnected and exposed again if the model/user reuses that stale ID. Since that ID is no longer in ctx.known, future reconciles also won’t clean it up, so detached integrations can persist unexpectedly.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a83ee4f: known now records only servers that connected (or that surfaced an auth notice). Transient transport failures stay out of known and the next /turn reconcile retries the connect.

Comment thread agents/runner/src/runtime.rs Outdated
Comment on lines +548 to +550
ctx.known.insert(server.id.clone());
let server_uid = McpServerId::new(server.id.clone());
if let Err(err) = connect_and_log(manager, &server_uid, "reconcile_add").await
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Retry failed MCP additions on later reconciles

New servers are marked as known before connection succeeds, so a transient non-auth failure in connect_and_log permanently suppresses future auto-connect attempts: subsequent reconciles skip the server due to ctx.known.contains(...). This means a newly attached integration can remain unavailable for the lifetime of the thread unless manually force-reconnected, which defeats the per-turn reconciliation behavior under flaky network/startup conditions.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a83ee4f: ForceReconnect now rejects any server_id that isn't in ctx.known. agentkit-mcp has no unregister API so the config does linger, but the guard means a stale id can't be resurrected via mcp_force_reconnect.

@speakeasybot
Copy link
Copy Markdown
Collaborator

speakeasybot commented May 22, 2026

🚀 Preview Environment (PR #3013)

Preview URL: https://pr-3013.dev.getgram.ai

Component Status Details Updated (UTC)
✅ Database Ready Existing database reused 2026-05-22 23:16:04.
✅ Images Available Container images ready 2026-05-22 23:15:47.

Gram Preview Bot

Address Codex review feedback on PR #3013:

- Insert into `ctx.known` only after a server connects cleanly or returns
  AuthRequired. A transient transport failure now leaves the entry out
  of `known` so the next /turn reconcile retries instead of suppressing
  the integration for the rest of the thread.
- Gate `McpCmd::ForceReconnect` on `known`. agentkit-mcp has no
  unregister API, so a detached server's config lingers in the manager;
  without this guard a stale id could be resurrected by the model
  invoking `mcp_force_reconnect` against a server the user has already
  removed.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a83ee4f894

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +564 to +565
ctx.known.insert(server.id.clone());
if let Some(notice) = create_auth_notice(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Retry auth-required servers when auth notice creation fails

In reconcile_servers, the auth-required path marks the server as known before the auth notice is successfully created and delivered. If create_mcp_auth_flow fails transiently (or the inbox send fails), subsequent reconciles will skip this server because of the ctx.known guard, so no later turn can re-emit assistant_mcp_auth_required and the integration can remain unusable for the thread.

Useful? React with 👍 / 👎.

Comment on lines +599 to +602
if let Err(err) = manager.disconnect_server(&server_uid).await {
tracing::debug!(server_id = %id, error = %err, "reconcile disconnect (ignored)");
}
ctx.known.remove(&id);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve detach retry when MCP disconnect fails

The removal loop drops the server ID from ctx.known even when disconnect_server returns an error. In failure cases this prevents future reconciles from retrying the detach (the ID is no longer tracked in known), so a server that failed to disconnect can linger in stale state for the lifetime of the thread.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

preview Spawn a preview environment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants